An Investigation of Signal Processors for Recognition

Authors

  • Mari Ostendorf
  • John P. Kaufhold
Abstract

Currently, in uncorrupted acoustic environments, a state-of-the-art, continuous-speech, 5000-word vocabulary speech recognizer can achieve a recognition accuracy of 95%. In a telephone channel acoustic environment, the same recognition system can double in error rate, which is unacceptable for most applications. It is the goal of this project to quantitatively measure how well a speech recognition system recognizes natural number utterances (e.g. "four thousand lira") over a telephone line. The performance of recognizers will be quantified using word error, which is a measure of the difference between the actual transcription of a speech waveform and its recognized transcription. This project will compare two signal processing algorithms used in recognition of telephone speech. One standard, widely accepted signal processing algorithm, cepstral mean subtraction (CMS), motivated by computational efficiency and high performance, was evaluated. Another, newer signal processing algorithm, RASTA, motivated primarily by auditory physiology, was compared to cepstral mean subtraction in terms of word error. The two algorithms performed comparably in tests. CMS produced a 22% word error and RASTA produced a 25% word error on this difficult task. This project is significant from a biomedical point of view because telephone speech recognizers can be used to aid the handicapped. For example, the deaf could use a speech recognition system to aid them in their telephone communication with hearing people.
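The two technical ideas named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the conventional definition of word error (word-level edit distance divided by the number of reference words), which the abstract does not spell out, and it assumes cepstral mean subtraction is applied per utterance by subtracting each cepstral coefficient's average over all frames. The function names `word_error_rate` and `cepstral_mean_subtraction` and the list-of-lists frame layout are hypothetical choices made for illustration.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the reference length.

    Assumption: substitutions, deletions and insertions each cost 1,
    which is the conventional scoring; the abstract does not state it.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcription must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


def cepstral_mean_subtraction(frames):
    """Subtract the per-coefficient mean over one utterance.

    `frames` is a list of equal-length lists of cepstral coefficients,
    one inner list per analysis frame (a hypothetical layout).
    """
    if not frames:
        return []
    num_frames = len(frames)
    num_coeffs = len(frames[0])
    means = [sum(frame[d] for frame in frames) / num_frames
             for d in range(num_coeffs)]
    return [[frame[d] - means[d] for d in range(num_coeffs)]
            for frame in frames]
```

For example, `word_error_rate("four thousand lira", "four hundred lira")` counts one substitution against three reference words. The mean subtraction is useful for telephone speech because a fixed channel filter shows up as a constant additive offset in the cepstral domain, so two utterances that differ only by such an offset map to the same normalized features.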





Publication date: 1995